-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
BUG: fix+test quantile with empty DataFrame, closes #23925 #27436
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
pandas/tests/frame/test_quantile.py
Outdated
tm.assert_series_equal(result, expected) | ||
|
||
result = df.quantile([0.5]) | ||
expected = pd.DataFrame([], index=[], columns=[0.5]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't quite understand this expected. Should index
and columns
be flipped (and in the implementation too)?
With DataFrame.quantlie([0.5])
the
- index is the
q
([0.5]
) - columns are the (numeric) columns
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you're right, yah. will update
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what do we do in the scalar-q case then?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think
In [11]: pd.Series(name=0.5)
Out[11]: Series([], Name: 0.5, dtype: float64)
right? So that it's equivalent to
In [18]: pd.DataFrame({"A": [1]}).quantile().drop("A")
Out[18]: Series([], Name: 0.5, dtype: float64)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm ex small comment, needs a whatsnew note (1.0)
# GH#23925 _get_numeric_data may drop all columns | ||
df = pd.DataFrame(pd.date_range("1/1/18", periods=5)) | ||
result = df.quantile(0.5) | ||
expected = pd.Series([], index=[], name=0.5) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add a name to the index of the frame which I think should propagate here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good point. i think it should be df.columns.name that propagates.
It looks like DataFrame_get_numeric_data isn't retaining the columns names, will open a separate issue
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also I pushed the 1.0 whatsnew if you rebase
doc/source/whatsnew/v1.0.0.rst
Outdated
@@ -193,6 +193,7 @@ ExtensionArray | |||
Other | |||
^^^^^ | |||
|
|||
- Bug in :meth:`DataFrame.quantile` with zero-column :class:`DataFrame` incorrectly raising (:issue:`23925`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
move this to numeric & remove the Other section so its not used anymore.
@@ -467,3 +467,17 @@ def test_quantile_empty(self): | |||
|
|||
# FIXME (gives NaNs instead of NaT in 0.18.1 or 0.19.0) | |||
# res = df.quantile(0.5, numeric_only=False) | |||
|
|||
def test_quantile_empty_no_columns(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
might be worth testing this with timedelta & period data as well (just to confirm that they work), but can do followup.
thanks @jbrockmendel |
black pandas
git diff upstream/master -u -- "*.py" | flake8 --diff